EFFICIENT LINKING OF NUCLEIC ACID SEGMENTS
BACKGROUND OF THE INVENTION 

Technical Field 

This invention relates to a method for the
efficient amplification of target DNA segments, which is
particularly advantageous for target genes
containing many small exons. Specifically, the method
involves a rapid polymerase chain reaction technique
for linking multiple gene segments from a single gene
or multiple genes into a large DNA molecule suitable
for further analysis.

Description of the Background Art 

The technique known as the polymerase chain
reaction (PCR) is a method of amplification of genomic
DNA (Saiki et al., Science: 1350-1354 (1985)).
Typically, the method uses two oligonucleotide primers
to amplify a single DNA segment millions of times.
Each cycle of DNA replication from the original
primer(s) produces a product which serves as a template
for further primer-dependent replication. This feature
of the method results in exponential increases in the
desired DNA product with each round of amplification,
and a rapid accumulation of DNA. Now an automated
technique performed using a thermal cycler and
thermostable DNA polymerases, PCR is widely used by
molecular biologists to prepare large amounts of
synthetic DNA.

This revolutionary technique has been used in a 
wide variety of fields in molecular biology, and has 
made possible the rapid identification of
disease-associated genes. Using PCR, it has become
feasible to diagnose inherited disorders and
susceptibility to disease at the molecular level.
Nevertheless, several disadvantages and limitations are 
recognized in the technique and its application to 
certain genes. 

Customarily, when using PCR to amplify genomic
DNA, each gene segment is amplified separately and then
analyzed. When the gene of interest contains more than
one exon, each exon must be amplified individually
using a separate PCR, and then linked together to form
a long DNA molecule representing the entire gene (Ho,
et al., 1989; Horton, et al., 1989). Because no more
than two gene segments can be linked together in each
joining PCR, the more exons the gene of interest
contains, the more separate amplifications must be
individually performed and the more PCRs are needed to
link the segments together. For genes which contain
many small exons, the preparation of a single gene
for analysis can become prohibitively labor-intensive
and time-consuming, particularly when the
small target exons must be individually scanned for
mutations or polymorphisms.

It has been shown that simultaneous amplification
of more than one DNA segment can be achieved with a
Multiplex PCR using primers tagged with an unrelated
20-nucleotide sequence from bacteriophage M13mp18 (Shuber
et al., 1995). However, with this method, products
amplified with primers lacking the 20-nucleotide
sequence were not reliably produced due to differences
in hybridization kinetics among the primers. Using the
prior art method, therefore, tagging each primer with
an identical 20-nucleotide sequence is necessary to
achieve efficient amplification of multiple sequences.
This prior art method thus allows multiple
amplifications, but the products of the amplification
all contain identical unrelated sequences which would
have to be removed or extended before they could be
linked to form one long DNA molecule containing all
portions of the gene of interest.

After the individual gene segments have been
separately amplified, further independent steps are
needed to reconstruct the complete desired gene
sequence from the smaller segments, with or without an
introduced mutation, before the entire gene is ready
for analysis. Prior art methods for linking sections
of DNA using PCR involve the joining of two segments at
a time, each PCR followed by a purification step (Ho
et al., 1989; Kim et al., 1996). The joining of
several gene segments together, therefore, requires
multiple PCRs and multiple purification steps. For
example, joining four exons to form one complete gene
using these prior art methods would require four
separate amplifying PCRs, four separate purifications
of the products and three joining PCRs. Each
additional DNA segment in the gene would require an
additional PCR to link it to the others, increasing
both the time and expense of preparing the DNA for
analysis using prior art methods.
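
The reaction counts in the four-exon example follow a simple pattern, which the minimal Python sketch below makes explicit. It assumes, as described above, that the pairwise prior art approach needs one amplifying PCR per segment plus one joining PCR for each additional segment, and that a multiplex-plus-linking scheme needs a single PCR of each kind. The function names are illustrative only and do not come from the cited references.

```python
def pairwise_pcr_count(n_segments: int) -> dict:
    """Count PCRs when segments are amplified singly and joined two at a time."""
    if n_segments < 2:
        raise ValueError("at least two segments are needed for linking")
    amplifying = n_segments      # one amplifying PCR per segment
    joining = n_segments - 1     # each joining PCR adds one more segment
    return {"amplifying": amplifying, "joining": joining,
            "total": amplifying + joining}


def multiplex_linking_pcr_count(n_segments: int) -> dict:
    """Count PCRs when all segments are amplified together, then linked together."""
    if n_segments < 2:
        raise ValueError("at least two segments are needed for linking")
    # One multiplex PCR and one linking PCR, independent of n_segments.
    return {"amplifying": 1, "joining": 1, "total": 2}


if __name__ == "__main__":
    for n in (2, 4, 8):
        print(n, pairwise_pcr_count(n)["total"],
              multiplex_linking_pcr_count(n)["total"])
```

Under these assumptions the pairwise count grows linearly with the number of segments, while the two-step count stays constant.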

In summary, the prior art methods of producing an
amplified copy of an entire gene composed of multiple
linked exons or an amplified copy of a long DNA
molecule composed of gene segments from more than one
gene have the disadvantage that multiple PCRs are
generally required at each step. Previously, it has not
been possible to efficiently amplify multiple gene
segments in a single PCR to yield products that had
complementary ends suitable for easy, rapid linkage in
a second single PCR.

Consequently, there has been a need in the field 
for a simple and rapid method allowing amplification of 
DNA which contains several linked DNA segments which 
occur in non-adjacent portions of target DNA. There is 
a need for a method which can produce an amplified DNA 
molecule containing an entire gene of linked exons from 
genomic DNA, e.g., for DNA diagnosis. 

SUMMARY OF THE INVENTION 

The present invention provides a method of linking
by PCR DNA segments which occur in non-adjacent
portions of target DNA, wherein each DNA segment
contains a sequence complementary to a sequence in the
DNA segment or segments to which it is to be linked,
comprising using a) a first primer which is
complementary to the antisense strand of the first DNA
segment to be linked and a second primer which is
complementary to the sense strand of the last DNA
segment to be linked; and b) at least one polymerase
lacking 3'→5' exonuclease activity and at least one
polymerase containing 3'→5' exonuclease activity.

In addition, the present invention provides a
method of producing and amplifying DNA containing at
least three linked DNA segments which occur in
non-adjacent portions of target DNA, comprising a)
providing a first primer and a second primer for each
DNA segment to be amplified, i) the first primer
(termed the D primer) having a 3' portion which is
complementary to the 3' end of the antisense strand of
the DNA segment and a 5' tail which is complementary to
the 5' end of the second primer for the previous DNA
segment or to a sequence internal to the previous DNA
segment; ii) the second primer (termed the U primer)
having a 3' portion which is complementary to the 3'
end of the sense strand of the DNA segment and a 5'
tail which is complementary to the 5' end of the first
primer for the subsequent segment or to a sequence
internal to the subsequent DNA segment; b) amplifying
the at least three DNA segments by multiplex PCR using
the pairs of first and second primers; and c)
subjecting the at least three amplified DNA segments to
a linking PCR using a sense primer which is
complementary to the antisense strand of the first
segment to be linked and an antisense primer which is
complementary to the sense strand of the last segment
to be linked.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1A is a schematic representation of the
Multiplex and Linking PCR steps of the inventive
process. Four regions of the p53 gene were amplified
by Multiplex PCR with the four primer pairs D1/U1, D2/U2,
D3/U3 and D4/U4. Each primer contains a GC-rich tail and
a sequence-specific region. The tail of a U primer is
complementary to the tail of the subsequent D primer.
After a simple purification step, the four PCR-amplified
DNAs (D1U1, D2U2, D3U3 and D4U4) are linked and
amplified by nested P and Q primers.

Figure 1B provides an example showing two types of
tails which can be used with the inventive process.
The type I tail of the D3 primer does not overlap the
sequence-specific region of the U2 primer, while the type II
tail overlaps it by 4 bases. Sense (SEQ ID NO:21)
and antisense (SEQ ID NO:22) sequences of the p53 gene,
and sense (SEQ ID NO:23) and antisense (SEQ ID NO:24)
sequences of the F9 gene are included in the tail.

Figure 1C is a schematic diagram illustrating the
use of the type III tail, in which the tail sequence is
complementary to an internal portion of the DNA
sequence to be joined.

Figure 2A shows the relative yields of PCR product
of the p53 gene with varied amounts of Vent and fixed
amounts of Tth, Taq, or Tfl enzymes. Fixed amounts of
Tth, Taq, or Tfl and increasing amounts of Vent were
used to link and amplify segments of the p53 gene.
Relative yields of the linked PQ product were
quantitated using a PhosphorImager (Molecular Dynamics)
after 15 cycles.

Figure 2B shows the relative yields of PCR product
of the F9 gene with varied ratios of Tth and Vent
enzymes. Increasing amounts of Tth and Vent were used
to link and amplify segments of the F9 gene. Again,
relative yields of the linked PQ product were
quantitated using a PhosphorImager after 15 cycles
(M = 120 ng φX174/Hae III DNA marker).

Figure 3 demonstrates the effect of differing
amounts of DNA template on the yield of PCR product.
In Figure 3A, the p53 gene was amplified using Tth/Vent
DNA polymerases with increasing amounts of template
DNA, and the p53 PQ product was quantitated. In
Figure 3B, Tth/Vent, Taq/Pfu or Tfl/Vent DNA
polymerases were applied to the F9 gene.

Figure 4 presents the relative yields and
accumulation of PCR product. Aliquots of identical
radioactively labeled Linking PCR mixtures were removed
from the thermocycler every 3 cycles from 9 to 30
cycles and the PQ PCR product was quantitated. Forty
nanograms (4A), 20 ng (4B), or 10 ng (4C) of p53 DNA
templates per 25 µl reaction were used.

Figure 5 gives the relative yields of PCR products
of the p53 and F9 genes with different annealing
temperatures.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 

The gene amplification method of the present
invention can produce large amounts of DNA composed of
several exons or a DNA composed of several
non-contiguous DNA segments from the same gene or different
genes, without requiring the time-consuming
amplification of each separate gene segment. Another
advantage of the present invention is the easy,
one-step linkage of the gene segments and amplification of
the linked product.

The invention has multiple uses which include, but
are not limited to, the following: i) efficient
scanning of mutations by methods such as restriction 
endonuclease fingerprinting when genomic DNA is 
analyzed from genes in which there are multiple short 
exons separated by long introns; ii) joining of 
different protein domains to generate a recombinant 
gene/RNA which has novel properties; and iii) linking 
RNAs together by generating cDNA, linking the cDNA with 
the primer that contains an RNA promoter sequence, and 
after linkage transcribing the linked segment to 
generate the RNA. 

Multiplex PCR

Multiplex PCR is the amplification of the desired
regions (for example, exons) of the genetic material
using a pair of primers for each individual region.
Figure 1 illustrates in schematic form the
amplification of four exons simultaneously with four
primer pairs prior to the linking step. The primer
pairs are designated D1/U1, D2/U2 ... Dn/Un. Each primer
contains a GC-rich tail and a sequence-specific region,
and the tail of each U primer is complementary to the
tail of the subsequent D primer. Three types of primer
tails are contemplated for use with the invention. In
type I primers, the tail of the D primer does not
overlap with the sequence-specific region of the
previous U primer. In type II primers, the tail of the
D primer overlaps the sequence-specific region of
the previous U primer. See Figure 1B. Type III
primers contain a tail portion which is complementary
to a sequence internal to the previous gene segment.
See Figure 1C for an example in schematic form.
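
The tail relationship described above can be checked computationally. The following is a minimal Python sketch, assuming both tails are written 5' to 3' and that a simple tail-to-tail pairing is intended, so that the tail of a D primer is the reverse complement of the tail of the previous U primer. The example sequence is hypothetical and is not taken from the primers of the invention.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def tails_are_complementary(u_tail: str, d_tail: str) -> bool:
    """True if the D tail can anneal along the full length of the U tail."""
    return reverse_complement(u_tail) == d_tail.upper()


if __name__ == "__main__":
    u1_tail = "GCAGGTCAGACG"                 # hypothetical 12-base tail
    d2_tail = reverse_complement(u1_tail)    # tail to place on the next D primer
    print(u1_tail, d2_tail, tails_are_complementary(u1_tail, d2_tail))
```

Deriving the D tail from the U tail in this way, rather than designing the two separately, is one way to avoid a mismatch between them.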

Primers were designed with Oligo 5 software
(National Biosciences, Inc.) and the GCG program
(Genetic Computer Group, Inc.). Oligo 5 calculates
a primer's melting temperature (Tm) by the nearest
neighbor method at 50 mM KCl and 250 pM DNA. The Tm
value of each PCR DNA was estimated by the Wetmur
formula, Tm(product) = 81.5 + 16.6 log[K+] +
0.41(%G + %C) - 675/length (Wetmur, 1991). Type I tails
do not overlap the sequence-specific region of the
complementary primer, while type II tails overlap the
sequence-specific region by four bases. Tails which
overlap by more or fewer bases are also suitable for
use with the invention. Type III tails are
complementary to an internal sequence within the
previous gene segment.
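
As an illustration of the product Tm estimate, the following minimal Python sketch evaluates the formula given above. It assumes that the salt term is the molar concentration of monovalent cation and takes 0.05 M (50 mM KCl) as a default; the function name and the default value are assumptions made for this example. The input values are those reported for the p53 PQ product (1433 bp, 57% G+C).

```python
import math


def product_tm(length_bp: int, gc_percent: float, salt_molar: float = 0.05) -> float:
    """Estimate the Tm (deg C) of a PCR product from length, %G+C and salt.

    Implements Tm = 81.5 + 16.6*log10(salt) + 0.41*(%G+C) - 675/length.
    The 0.05 M default salt concentration is an assumption for illustration.
    """
    if length_bp <= 0 or salt_molar <= 0:
        raise ValueError("length and salt concentration must be positive")
    return (81.5
            + 16.6 * math.log10(salt_molar)
            + 0.41 * gc_percent
            - 675.0 / length_bp)


if __name__ == "__main__":
    # p53 PQ product: 1433 bp, 57% G+C
    print(f"{product_tm(1433, 57.0):.1f}")
```

For long products the length term is small, so the estimate is governed mainly by the G+C percentage and the salt concentration.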

Guidelines for Primer Design 

Designing the appropriate primers is a critical
step in successfully performing Linking PCR. Based on
this work, building on other studies using Multiplex
PCR (Liu et al., 1997), the following guidelines for
primer design were developed and successfully applied.

a. Sequence-specific region 

The sequence-specific region affects the yields
and specificities of the Multiplex PCR. The criteria
are set as follows:

1. The Tm value should be approximately 35°C
below the average Tm value of the targeted regions. A
Tm value lower than this may result in low PCR yields,
especially if the region of the gene segment to be
amplified contains a high GC percentage.

2. The stringency for dimer or hairpin
formation at the 3' end should preferably be set at ≤4
base pairs among all primers. Dimer or hairpin
formation has the potential to cause a greater problem
in Multiplex PCR than in ordinary PCRs using only two
primers.

3. The stringency for false priming sites at
the 3' end should preferably be set at ≤6 base pairs
for all strands and for all regions.

4. Internal stability may be chosen based on
the instructions in the Oligo 5 software package.

b. Tail region

The tail is short (preferably less than 20 bases)
and contains a high percentage of GC bases, which
functions to provide consistent and balanced high
yields of Multiplex PCR products, and an efficient and
specific "linker" for the Linking PCR. The criteria
should be set as follows:

1. The GC content should preferably be from
60% to 70%.

2. The tail size is preferably 10-15 bases
long and most preferably 12 bases long.

3. The stringency for false priming of the
primer's antisense sequence at its 3' end should
preferably be ≤6 bases for any strand and any target.

4. A type II tail is preferred.
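
The length and GC criteria for the tail lend themselves to a simple automated check. The Python sketch below is a minimal illustration that covers only criteria 1 and 2 (GC content of 60% to 70% and a length of 10-15 bases); it does not assess false priming or tail type. The function names and the example sequences are hypothetical.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a DNA sequence."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def check_tail(tail: str, min_len: int = 10, max_len: int = 15,
               min_gc: float = 60.0, max_gc: float = 70.0) -> list:
    """Return a list of criteria the tail fails; an empty list means it passes."""
    problems = []
    if not min_len <= len(tail) <= max_len:
        problems.append(f"length {len(tail)} outside {min_len}-{max_len} bases")
    gc = gc_percent(tail)
    if not min_gc <= gc <= max_gc:
        problems.append(f"GC content {gc:.1f}% outside {min_gc}-{max_gc}%")
    return problems


if __name__ == "__main__":
    for tail in ("GCAGGTCAGACG", "ATATATATAT", "GCGCGCGC"):
        print(tail, check_tail(tail) or "passes")
```

Returning the list of failed criteria, rather than a single pass/fail flag, shows which parameter of a candidate tail needs adjustment.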

Optimization of Multiplex PCR

The parameters of the PCR may be optimized
according to Shuber et al. (Shuber et al., 1995) or
determined empirically. For the following examples,
the optimization strategy of Shuber et al. (Shuber et
al., 1995) was followed, except for the annealing
temperature. The optimal annealing temperature was
determined empirically and was expected to be
approximately 20-25°C below the average Tm of the gene
regions being amplified (Liu et al., 1997). A
preferred strategy for optimization of the Multiplex
PCR step is as follows:

a. Test each PCR:

Concentrations of primer, Mg, DMSO, and the amount
of TaqGold DNA polymerase should be optimized for each
polymerase chain reaction. The optimal annealing
temperature should be approximately 20-25°C lower than
the average Tm of the regions to be amplified, but
ultimately should be determined empirically. If a
region is not being efficiently amplified, adding an
additional one or two bases to the sequence-specific
region of the primer may increase the yield.

b. Test the Multiplex PCR:

The common parameters of each PCR should be chosen
to generate balanced high yields of the specific
desired Multiplex PCR products. The Taq DNA polymerase
may be present in amounts as high as 2-6 units per 25
µl reaction. Rarely, the primer concentration may need
further adjustment to achieve even, balanced yields of
each DNA segment. If satisfactory results are still
not achieved, a change in the primer sequence may be
necessary.

TaqGold with hot-start was found to be important to
prevent primer dimer formation and false priming. One
potential difficulty in the p53 gene was the
amplification of exons 10 and 11, which are separated
by an intron of 800 base pairs. However, our results
showed no large PCR DNA spanning the two exons when the
Tm of the sequence-specific regions of the U3 and D4
primers was increased and their relative concentrations
adjusted according to our preferred optimization
scheme.

Linking PCR

Amplified DNA segments produced by multiplex PCR
or any other suitable method may be joined with the
linking PCR method, with or without prior purification
to remove unincorporated primers. First, the
antisense strands of U1 and D2 tails, U2 and D3 tails, and U3
and D4 tails are annealed and extended, so the four
D1U1, D2U2, D3U3, and D4U4 DNAs are linked into a D1U4
molecule in numerical order. If the primers are
complementary to a different region of the DNA segment
to be joined, the complementary regions are annealed
and extended. Second, PCR amplifies the joined
template with nested primers such as P and Q (Figure
1A).

Tails of 12-base size worked efficiently, although
tails of 10-15 bases, or a greater range, are also
suitable. The tails of primers P and Q prevent
"megapriming," which occurs when a PQ product generated
in an earlier cycle acts as a primer for a larger D1U4
template in a subsequent cycle (Sarkar and Sommer,
1992; Sarkar and Sommer, 1990). Also, the tail acts as
a switch from low amplification efficiency to high
efficiency, depending on whether the primer anneals to
the D1U4 or the PQ template (Liu et al., 1997).

One of three DNA polymerases lacking 3'→5'
exonuclease activity (Tth, Taq (Boehringer Mannheim)
and Tfl (Promega)) was combined with one of two
enzymes possessing 3'→5' exonuclease activity (Vent
(New England BioLabs), Pfu (Stratagene)) to perform the
inventive method. The effect of the enzyme which
possesses 3'→5' exonuclease activity is speculated to be
removal of the potential extra non-template A base at
the 3' end of the PCR product (Wu et al., 1989). Persons
of skill in the art will recognize that other enzymes
may be used with the present invention, such as Pwo and
Plo, but it is key that the polymerase activity is due
to one (or more) enzymes without 3'→5' exonuclease
activity and one (or more) enzymes with 3'→5'
exonuclease activity. These enzymes and enzyme
combinations serve only as examples by way of
illustration and are not intended to limit the
invention.

A solid or liquid macromolecular additive may be
used in the linking PCR mixture. Macromolecular
additives such as polyethylene glycol (PEG) may reduce
the amount of template needed to obtain a satisfactory
result.

Optimization of Linking PCR

The following preferred parameters are not 
intended to limit the invention. Skilled molecular 
biologists will recognize that different parameters may 
be used with the invention. 

a. Primers

The nested primers P and Q should be designed
using the same criteria and methods as described above
for the D and U primers, except that the Tm of the
sequence-specific regions is preferably approximately
35-40°C lower than that of the D1U4 DNA product.

b. Polymerases 

Tth/Vent DNA polymerases are preferably present at
approximately 1 U/0.1 U or 1 U/0.05 U per 25 µl reaction.

c. Linkage efficiency 

The linkage of individual DNA segments is 
preferably tested by measuring the linking efficiencies 
of all the regions desired to be linked, and all 
shorter linked segments. For example, if the desired 
complete DNA sequence is made up of 4 segments, the 
linkage efficiency of 4, 3, and 2 segments would be 
measured with the appropriate primer pairs. Table 1 
illustrates this suggested method. 



Table 1. Relative mole ratios of PCR products
with different primer pairs

Gene  Template  Primer pair
      amount    D1/U2   D2/U3   D3/U4   D1/U3   D2/U4   D1/U4   P/Q     None

p53   40 ng     10.85    8.10    9.09    6.73    7.27    4.81   5.31    0.22
      20 ng      7.74    4.56    7.60    0.98    3.19    0.57   1.00    0.17
      10 ng      8.54    4.44    7.08    1.07    0.56    0.09   0.13    0

F9    40 ng     18.54   18.58   17.60   14.43   14.27    8.77   4.87    0.16
      10 ng     17.78   17.60   16.85    8.18    8.58    3.61   1.00    0

The mole ratio for each primer pair, and for the reaction
with no primer, is obtained by normalizing the relative
yield by the potential amount of incorporated
radioactive 32P-dCTP.
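
The set of primer pairs needed to measure the linkage efficiency of every run of adjacent segments can be enumerated mechanically. The Python sketch below is a minimal illustration, assuming the D1/U1 ... Dn/Un naming used above and that only consecutive segments are linked; the function name is hypothetical.

```python
def linkage_test_pairs(n_segments: int) -> dict:
    """Map the number of linked segments to the (D, U) primer pairs spanning them."""
    if n_segments < 2:
        raise ValueError("at least two segments are needed for linking")
    pairs = {}
    for span in range(2, n_segments + 1):
        # A product of `span` segments starting at segment `start`
        # is amplified by D<start> and U<start + span - 1>.
        pairs[span] = [(f"D{start}", f"U{start + span - 1}")
                       for start in range(1, n_segments - span + 2)]
    return pairs


if __name__ == "__main__":
    for span, primer_pairs in linkage_test_pairs(4).items():
        print(span, ", ".join(f"{d}/{u}" for d, u in primer_pairs))
```

For four segments this gives three pairs spanning two segments, two pairs spanning three segments, and one pair spanning all four, in addition to the nested P/Q pair used for the final product.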

d. Annealing temperature

The optimal annealing temperature should be
determined using a large amount of DNA template, and
is generally associated with the percentage of GC bases
in the DNA templates. See Figure 5.

e. DNA template concentration

The preferred DNA template concentration is
determined as shown in Figure 3, and is dependent on
how many DNA segments are to be linked together.

f. Cycle number

The optimal cycle number for each Linking PCR
should be determined. Routinely, 20-25 cycles are most
efficient and yield the best product.

g. Quality control

The identity and quality of the Linking PCR
product is preferably confirmed by direct sequencing.
If the product is not of the correct sequence, the
tails of the primers from the multiplex step and the P
and Q primers should be double-checked.

EXAMPLES

1. Multiplex PCR of p53 gene exons

Four primer pairs, D1/U1, D2/U2, D3/U3, and
D4/U4 (Table 1A), were used to amplify exons 1, 2-4, 10
and 11 in the p53 gene. Each primer contained a
GC-rich tail and a sequence-specific region. The tails of
the U primers were complementary to the tails of each
subsequent D primer. This example used a type I tail,
in which the tail of the D3 primer does not overlap
the sequence-specific region of the U2 primer
(Table 1A, Figure 1A). A hot-start at 92°C for 10
minutes was included for enzyme activation. The
denaturation was at 94°C for 15 seconds, and the
annealing was at 55°C for 30 seconds, followed by
elongation at 72°C for 2 minutes, for a total of 35
cycles with a Perkin Elmer model 9600 thermal cycler.
The PCR mixture contained 50 mM KCl, 10 mM Tris/HCl, pH
8.3, 1.5 mM MgCl2, 200 µM of each dNTP, 5% DMSO, 3-4 U of
TaqGold DNA polymerase (Perkin Elmer), and 250 ng of
genomic DNA per 25 µl of reaction. After purification
in a Centricon-100 microconcentrator (Amicon), the
amount of DNA was determined by spectrophotometer at
260 nm. The four expected DNA products were obtained
in similar molar ratios, and the complementary tails
did not cause obvious problems.

2. Multiplex PCR of F9 gene exons

Four primer pairs, D1/U1, D2/U2, D3/U3, and
D4/U4 (Table 1B), were used to amplify exons 1, 2-3, 4,
and 5 of the F9 gene. As in the above example, each
primer contained a GC-rich tail and a sequence-specific
region, and the tails of the U primers were
complementary to the tails of each subsequent D primer.
This example uses a type II tail, in which the tails of
the D1, D2, and D3 primers overlapped the
sequence-specific regions of the corresponding U
primers by four bases (Table 1B, Figure 1B). The PCR
mixture and reaction parameters were the same as in
example 1 with the exception that 5% DMSO was omitted
from the reaction mixture.



WO 99/16904                              PCT/US98/19968



Table 1. Primers

A. p53 gene

#      Name(a)               Conc   Sequence(b)                             tTm(c)  cTm(c)
                             (µM)                                           (°C)    (°C)

1      (5'UT)(750)28D1       0.05                                           28.9    46.1
2      (I1)(995)29U1         0.05                                           31.8    44.9
3      (I1)(11641)30D2       0.1                                            31.8    43.3
4(d)   (I4)(12352)30U2       0.1                                            30.9    51.5
5      (I9)(17480)31D3       0.06   actataccatcqTCCCTrATAAAriTraziArfl      30.9    43.2
6      (I10)(17741)31U3      0.2    aaaatqqatqt(;;r:rTATr:r:rTTTrrAArrTfi   21.6    46.9
7      (I10)(18547)31D4      0.2    Qacacccacrt-rArrrTrTPArTr'ATrtT^ftT     21.6    42.?
8      (I11)(18756)29U4      0.06   CCCOtaaqqarpriArrrA^AA'^'^'^AAAPT       29.2    43.9
9      (5'UT)(769)29D[P]     0.05   caacqqqt^r^qqAar.anTrr.Arr.qqp.'r'-     29.5    43.3
10     (I11)(18717)29U[Q]    0.05   cotataqqtqctGAGr.r.Aar.rTr.Triir,Tr,    29.6    44.6

B. F9 gene

#      Name(a)               Conc   Sequence(b)                             tTm(c)  cTm(c)
                             (µM)                                           (°C)    (°C)

11     (5'UT)(-102)25D1      0.1    tcacaaaqGAGGCCATTnnAAAT^                -7.2    41.9
12     (I1)(248)25U1         0.1    ataaacgqtQGTOC'VGc:cTc:jjj\Qj\          30.?    41.?
13     (I1)(6104)29D2        0.2    cacGaccqcTTACrGGAArrCTCTrGACT           30.?    40.2
14     (I3)(6870)27U2        0.2    accaacaqTGGpATAAr.r.rrnTAcijAT          29.6    41.5
15     (I4)(10264)25D3       0.14   accactqt 7»(?(;rTTrrAr,r:Trnn:Tn        ?9.?    43.1
16     (I4)(10618)28U3       0.14   aGLCcqqqATCAAAGGTATr.TrjTTAA^-          32.5    38.5
17     (I4)(17584)27D4       0.1    taa tcccqG;\c:cc.ATArArnAnTrAnr         12.2    40.4
18     (I5)(17897)27U4       0.1    acqgaqs^(pAGnAAr;rAr;ATTrn2\/^7AG       -12.0   40.6
19     (5'UT)(-57)23D[P]     0.05   tc^aqq^qf;Af:r:r;Ar;ATr:r:ArAT * -      -17.4   33.6
20     (I5)(17856)23U[Q]     0.05   tqgtqtqQTTAAAATnrTrzAAnT                ?17.1   27.2

(a) p53 gene PCR products are D1U1 (270 bp, 58% G+C), D2U2 (736
bp, 59% G+C), D3U3 (286 bp, 55% G+C), D4U4 (235 bp, 54% G+C) and PQ
(1433 bp, 57% G+C). The numbering system is based on GenBank
Accession: X54156. F9 gene PCR products are D1U1 (367 bp, 40.6%
G+C), D2U2 (784 bp, 31.4% G+C), D3U3 (371 bp, 39.9% G+C), D4U4 (330
bp, 36.4% G+C) and PQ (1702 bp, 35% G+C). The numbering system is
as described in Yoshitake, et al. (Yoshitake, et al., 1985). Key
to primer names: 5'UT indicates the primer begins at a 5'
untranslated region; the letter I followed by a number indicates
the primer begins at that intron; the number in parentheses
indicates the nucleotide at which the sequence begins; the
following number indicates the sequence length; and the letter D
or U indicates a downstream or upstream primer.

(b) The underlined region is the tail and the capitalized region
is the sequence-specific region.

(c) tTm and cTm represent the Tm values of the tail and the
sequence-specific region of a primer, respectively.

(d) The anti-sense sequences of primer #4 have 7 bp false
priming sites at the 3' end.

3. Linking PCR

Linking PCR was performed as follows.
Denaturation proceeded at 94°C for 15 seconds, annealing
at 55°C for 30 seconds, ramped to 72°C within one minute,
and then elongation at 72°C for 2-3 minutes, for a total
of 15 cycles. The mixture contained 100 mM KCl, 10 mM
Tris/HCl, pH 8.9, 1.5 mM MgCl2, 50 µg/ml BSA, 0.05%
(v/v) Tween 20, 200 µM of each dNTP, 1 U of Tth
(Boehringer Mannheim) and 0.1 U of Vent (New England
Biolabs) DNA polymerases, 20 ng each of the four DNAs,
and 5 µCi of alpha-32P-dCTP (300 Ci/mmol, Amersham) per
25 µl reaction, unless otherwise stated. The PCR
products were separated on a 2% agarose gel, which was
then stained with ethidium bromide and UV photographed
with an Alpha Imager 2000 CCD camera (Alpha Innotech).
The PCR was quantitated by PhosphorImager with
ImageQuant software (Molecular Dynamics) after the
dried gel was exposed for 30 minutes. The PCR yields
were quantitated as "random units," i.e. the number of
pixels in the PCR band minus the background.

To quantitate the accumulation of linked PQ PCR
product, aliquots of the Linking PCR reaction mixture
were removed from the thermocycler every 3 cycles from
9 cycles to 30 cycles. Reactions containing 40 ng,
20 ng, and 10 ng of the four p53 DNA templates per 25 µl
reaction volume (Figure 4) were used. The first
appearance of the faint PQ product was dependent on the
amount of DNA template in the reaction mixture,
supporting the existence of four-component linking
kinetics. During the later cycles, the PQ PCR product
accumulated to a considerable extent, and reached a
saturation point. The point at which saturation was
reached was also dependent on the amount of DNA
template originally added to the reaction.
Furthermore, the relative presence of intermediate
products was greatly reduced after 20 cycles (Figure
4). A similar result was obtained with the F9 gene.

4. Optimization of Linking PCR parameters

a. Enzyme type, concentration, and ratio

As seen in Figure 2A, 1 U of Tth, Taq, or Tfl
was mixed with 0-0.2 U Vent to test the yield of p53
gene PCR product as quantitated by PhosphorImager after
15 cycles under various enzyme conditions. The results
show that Tth/Vent in ratios of 1:0.1 and 1:0.05 (lanes
2 and 3) generated the highest yield. Relative linking
PCR efficiencies were Tth/Vent or Tfl/Vent > Tfl/Pfu >
Taq/Pfu > Tth/Pfu > Taq/Vent. Any single enzyme alone
did not work optimally.

Further tests with Tth and Vent were performed in
linking exons of the F9 gene, changing both the amounts
and ratio of the two enzymes in the reaction (Figure
2B). Amounts of Tth ranged from 0.125 U to 4 U, and
amounts of Vent ranged from 0.0125 U to 0.4 U, with 4 U
Tth and 0.1 U Vent generating the highest yield (lane
7). The results show that both the absolute amount of
the two enzymes and the ratio influence the efficiency
of the linking polymerase chain reaction. Similar
results were achieved when linking PCR was performed
with segments containing 15 base pairs of complementary
sequence and when 49 ng of template DNA was used.

b. DNA template concentration

Using increasing amounts of each of the four DNA
templates (D1U1, D2U2, D3U3 and D4U4) with a constant
amount of Tth/Vent, a "threshold" between 10 and 20 ng
template DNA per 25 µl reaction volume (lanes 4 and 5)
was noted. Doubling the template concentration
resulted in a 14-16-fold increase in linked p53
product. The yield of linked product is therefore
proportional to the fourth power of the template
concentration (yield = (total amount of all
template)^4). This was confirmed by
repeating the experiment using separate steps of
linking and subsequent amplification. In addition, an
experiment using DNA templates with longer 15-base
tails produced the same effect. Similar "threshold"
effects occurred when Taq/Pfu and Tfl/Vent enzymes were
applied to the F9 gene (40% GC content rather than 57%
GC content as in the p53 gene), indicating the effect
is not dependent on either the enzymes or the
particular DNA templates used (Figure 3B).
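
The fourth-power relationship can be illustrated with a short calculation. The Python sketch below is a minimal example, assuming that the amounts of all templates are changed by the same factor; the generalization of the exponent to a number of segments other than four is an assumption made for illustration and is not a measured result.

```python
def relative_linked_yield(template_fold_change: float, n_segments: int = 4) -> float:
    """Expected fold change in linked product when every template is scaled.

    Assumes yield is proportional to (template amount) ** n_segments, with
    n_segments = 4 for the four-template linking described above.
    """
    if template_fold_change <= 0 or n_segments < 1:
        raise ValueError("fold change must be positive and n_segments >= 1")
    return template_fold_change ** n_segments


if __name__ == "__main__":
    # Doubling all four templates, e.g. from 10 ng to 20 ng each.
    print(relative_linked_yield(2.0))
    # Halving all four templates.
    print(relative_linked_yield(0.5))
```

A two-fold change in template thus corresponds to an expected 16-fold change in linked product, in line with the observed 14-16-fold increase and with the sharp "threshold" between 10 and 20 ng.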

c. Linkage with different primer pairs

Besides the primer pair P/Q, other primer
pairs were compared: D1U2, D2U3, and D3U4 amplified two
linked regions; D1U3 and D2U4 amplified three linked
regions; and D1U4 amplified four linked templates.
The mole ratios of D1U2, D2U3, and D3U4; of
D1U3 and D2U4; and of D1U4 (the normalized relative yield
or number of potential incorporated radioactive
32P-dCTP) reflect the relative linking efficiencies of
two-, three-, and four-template reactions. Table 2
shows that Linking PCRs linking two templates are much
more efficient than those linking four templates.
Also, Linking PCRs linking two templates are much less
dependent on the template amount.

d. Tail length 

Tails of 10, 12, or 15 bases, designed to contain 
60-70% GC, were tested linking exons of the F8 gene. 
Tails containing 12 bases were most efficient at 
linkage. Further experiments with 12- and 15-base 
tails in the p53 gene (Tm ranging from 21.6°C to 44.1°C) 
and 12-base tails in the F9 gene (Tm ranging from 29.6°C 
to 32.3°C) yielded the same results: 12-base tails were 
most efficient. 
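
The tail design constraints (10 to 15 bases, 60-70% GC, complementary between adjacent segments) lend themselves to a quick check. The sketch below is illustrative only: the function names are hypothetical, and the two example strings are the first 12 bases of SEQ ID NO: 2 and SEQ ID NO: 3 as printed in the sequence listing; treating those 12 bases as the 5' tails is an assumption. 

```python
# Illustrative helpers; names are hypothetical.
COMPLEMENT = str.maketrans("acgt", "tgca")


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.lower()
    return sum(base in "gc" for base in seq) / len(seq)


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.lower().translate(COMPLEMENT)[::-1]


def tail_meets_design(tail: str, min_len: int = 10, max_len: int = 15,
                      min_gc: float = 0.60, max_gc: float = 0.70) -> bool:
    """Check tail length and GC content against the ranges given in the text."""
    return (min_len <= len(tail) <= max_len
            and min_gc <= gc_fraction(tail) <= max_gc)


# ASSUMPTION: the first 12 bases of SEQ ID NO: 2 and SEQ ID NO: 3 are the tails.
tail_a = "aggacgaccgct"   # 5' end of SEQ ID NO: 2
tail_b = "agcggtcgtcct"   # 5' end of SEQ ID NO: 3

print(tail_meets_design(tail_a), tail_meets_design(tail_b))
print(reverse_complement(tail_b) == tail_a)
```

Both 12-base stretches contain 8 G or C bases out of 12 and are reverse complements of each other, which is the property that lets the ends of adjacent amplified segments anneal in the linking reaction. 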

e. Annealing temperature 

The effects of annealing temperature were studied 
using a Gradient Robocycler (Stratagene). For the p53 
gene, using four templates with 12-base tails, linked 
product was formed with high yields at annealing 
temperatures from 50°C up to 58°C. For the F9 gene, 
under the same conditions, high yields of linked 
product were formed at annealing temperatures from 47°C 
up to 55°C. The optimal annealing temperature is 
relatively low and has a broad range. The optimal 
annealing temperature also is associated with the GC 
content of the templates. 
CLAIMS 

I Claim: 

1. A method of linking by PCR DNA segments which 
occur in non-adjacent portions of target DNA wherein 
each DNA segment contains a sequence complementary to a 
sequence in the DNA segment or segments to which it is 
to be linked, comprising using 

a. a first primer which is complementary to the 
antisense strand of the first DNA segment to 
be linked and a second primer which is 
complementary to the sense strand of the last 
DNA segment to be linked; and 

b. at least one polymerase lacking 3'-5' 
exonuclease activity and at least one 
polymerase containing 3'-5' exonuclease 
activity. 

2. The method of claim 1, wherein the at least 
one polymerase lacking 3'-5' exonuclease activity is 
selected from the group Tth, Taq, and Tfl. 

3. The method of claim 1, wherein the at least 
one polymerase containing 3'-5' exonuclease activity is 
selected from the group Pfu, Plo, and Pwo. 

4. The method of claim 1, wherein the at least 
one polymerase lacking 3'-5' exonuclease activity is 
present in a concentration of from about 0.125U to 
about 4U per 25 µl reaction. 

5. The method of claim 1, wherein the at least 
one polymerase lacking 3'-5' exonuclease activity is 
present in a concentration of 4U per 25 µl reaction. 

6. The method of claim 1, wherein the at least 
one polymerase containing 3'-5' exonuclease activity is 
present in a concentration of from about 0.0125U to 
about 0.4U per 25 µl reaction. 

7. The method of claim 1, wherein the at least 
one polymerase containing 3'-5' exonuclease activity is 
present in a concentration of 0.1U per 25 µl reaction. 

8. The method of claim 1, wherein the at least 
one polymerase lacking 3'-5' exonuclease activity and 
the at least one polymerase containing 3'-5' 
exonuclease activity are present in a ratio of from 
about 1:0.0125 to about 1:0.2. 

9. The method of claim 1, wherein the at least 
one polymerase lacking 3'-5' exonuclease activity and 
the at least one polymerase containing 3'-5' 
exonuclease activity are present in a ratio of 1:0.05. 

10. The method of claim 1, wherein the at least 
one polymerase lacking 3'-5' exonuclease activity and 
the at least one polymerase containing 3'-5' 
exonuclease activity are present in a ratio of 1:0.1. 

11. The method of claim 1, wherein the DNA 
segments are amplified using a first primer having a 3' 
portion which is complementary to the 3' end of the 
antisense strand of the DNA segment and a 5' tail which 
is complementary to the 5' end of the second primer for 
the previous DNA segment; and a second primer having a 
3' portion which is complementary to the 3' end of the 
sense strand of the DNA segment and a 5' tail which is 
complementary to the 5' end of the first primer for the 
subsequent segment. 

12. The method of claim 1, wherein the DNA 
segments are amplified using a first primer having a 3' 
portion which is complementary to the 3' end of the 
antisense strand of the DNA segment and a 5' tail which 
is complementary to a sequence internal to the previous 
DNA segment; and a second primer having a 3' portion 
which is complementary to the 3' end of the sense 
strand of the DNA segment and a 5' tail which is 
complementary to a sequence internal to the subsequent 
segment. 

13. The method of claim 1, wherein the DNA 
segments are exons of a single gene. 

14. The method of claim 1, wherein the DNA 
segments are exons of different genes. 

15. The method of claim 1, wherein the DNA 
segments are nonexon portions of a single gene. 

16. The method of claim 1, wherein the DNA 
segments are nonexon portions of different genes. 
17. The method of claim 1, wherein the DNA 
segments originate from organisms of the same species. 

18. The method of claim 1, wherein the DNA 
segments originate from organisms of one or more 
different species. 

19. The method of claim 1, wherein the DNA 
segments contain tails which are complementary to the 
tails of adjacent DNA segments. 

20. A method of producing and amplifying DNA 
containing at least three linked DNA segments which 
occur in non-adjacent portions of target DNA, 
comprising 

a. providing a first primer and a second primer 
for each DNA segment to be amplified, 

i. the first primer (termed the D primer) 
having a 3' portion which is 
complementary to the 3' end of the 
antisense strand of the DNA segment and 
a 5' tail which is complementary to the 
5' end of the second primer for the 
previous DNA segment; 

ii. the second primer (termed the U primer) 
having a 3' portion which is 
complementary to the 3' end of the sense 
strand of the DNA segment and a 5' tail 
which is complementary to the 5' end of 
the first primer for the subsequent 
segment; 

b. amplifying the at least three DNA segments by 
multiplex PCR using the pairs of first and 
second primers; and 

c. subjecting the at least three amplified DNA 
segments to a linking PCR using a sense 
primer which is complementary to the 
antisense strand of the first segment to be 
linked and an antisense primer which is 
complementary to the sense strand of the last 
segment to be linked. 

21. A method of producing and amplifying DNA 
containing at least three linked DNA segments which 
occur in non-adjacent portions of target DNA, 
comprising 

a. providing a first primer and a second primer 
for each DNA segment to be amplified, 

i. the first primer (termed the D primer) 
having a 3' portion which is 
complementary to the 3' end of the 
antisense strand of the DNA segment and 
a 5' tail which is complementary to a 
sequence internal to the previous DNA 
segment; 

ii. the second primer (termed the U primer) 
having a 3' portion which is 
complementary to the 3' end of the sense 
strand of the DNA segment and a 5' tail 
which is complementary to a sequence 
internal to the subsequent segment; 

b. amplifying the at least three DNA segments by 
multiplex PCR using the pairs of first and 
second primers; and 

c. subjecting the at least three amplified DNA 
segments to a linking PCR using a sense 
primer which is complementary to the 
antisense strand of the first segment to be 
linked and an antisense primer which is 
complementary to the sense strand of the last 
segment to be linked. 

22. The method of claim 20, wherein after step b 
and before step c unincorporated primers are removed 
from the amplification reaction mixture. 

23. The method of claim 21, wherein after step b 
and before step c unincorporated primers are removed 
from the amplification reaction mixture. 

24. The method of claim 20, wherein the 5' tails 
do not overlap the 3' portion which is complementary to 
the 3' end of the sense strand of the DNA segment. 

25. The method of claim 20, wherein the 5' tails 
overlap the 3' portion which is complementary to the 3' 
end of the sense strand of the DNA segment. 

26. The method of claim 20, wherein the 5' tails 
overlap the 3' portion which is complementary to the 3' 
end of the sense strand of the DNA segment by four 
nucleotides. 



WO 99/16904 

PCT/US98/19968 



27. The method of claim 20, wherein the 5' tails 
are GC-rich. 

28. The method of claim 20, wherein the 5' tails 
contain about 60% to about 70% G and C nucleotides. 

29. The method of claim 20, wherein the 5' tails 
are about 10 to about 15 nucleotides long. 

30. The method of claim 21, wherein the 5' tails 
are about 10 to about 15 nucleotides long. 

31. The method of claim 20, wherein the 5' tails 
are 12 nucleotides long. 

32. The method of claim 21, wherein the 5' tails 
are 12 nucleotides long. 

33. The method of claim 20, wherein the DNA 
segments are exons. 

34. The method of claim 21, wherein the DNA 
segments are exons. 

35. The method of claim 20, wherein the DNA 
segments are exons of a single gene. 

36. The method of claim 21, wherein the DNA 
segments are exons of a single gene. 



37. The method of claim 20, wherein the DNA 
segments are exons of different genes. 

38. The method of claim 21, wherein the DNA 
segments are exons of different genes. 

39. The method of claim 20, wherein the DNA 
segments are nonexon portions of a single gene. 

40. The method of claim 21, wherein the DNA 
segments are nonexon portions of a single gene. 

41. The method of claim 20, wherein the DNA 
segments are nonexon portions of different genes. 

42. The method of claim 21, wherein the DNA 
segments are nonexon portions of different genes. 

43. The method of claim 20, wherein the DNA 
segments originate from organisms of the same species. 

44. The method of claim 21, wherein the DNA 
segments originate from organisms of the same species. 

45. The method of claim 20, wherein the DNA 
segments originate from organisms of one or more 
different species. 

46. The method of claim 21, wherein the DNA 
segments originate from organisms of one or more 
different species. 

47. The method of claim 20, wherein the linked 
DNA product is a copy of a gene lacking large introns. 
48. The method of claim 21, wherein the linked 
DNA product is a copy of a gene lacking large introns. 

49. The method of claim 20, wherein the linked 
DNA product contains a mutation. 

50. The method of claim 21, wherein the linked 
DNA product contains a mutation. 

51. The method of claim 20, wherein the linking 
polymerase chain reaction mixture contains a solid or 
liquid macromolecular additive. 

52. The method of claim 21, wherein the linking 
polymerase chain reaction mixture contains a solid or 
liquid macromolecular additive. 

53. The method of claim 20, wherein the 
macromolecular additive is polyethylene glycol. 



54. The method of claim 21, wherein the 
macromolecular additive is polyethylene glycol. 
[Drawings omitted. Surviving labels: CYCLES; LANE M 1; B 20ng; C 10ng] 
SEQUENCE LISTING 

<110> LIU, QIANG 

<120> EXON-LINKING FOR DNA BASED DIAGNOSTICS 

<130> 2124-292 

<140> 
<141> 

<150> US 60/060319 
<151> 1997-09-29 

<160> 24 

<170> PatentIn Ver. 2.0 - beta 

<210> 1 

<211> 28 

<212> DNA 

<213> Homo sapiens 

<400> 1 

gttcgcagag ggtttgtgcc aggagcct 

<210> 2 

<211> 29 

<212> DNA 

<213> Homo sapiens 

<400> 2 

aggacgaccg ctagcccgtg actcagaga 

<210> 3 

<211> 30 

<212> DNA 

<213> Homo sapiens 

<400> 3 

agcggtcgtc ctccagggtt ggaagtgtct 

<210> 4 

<211> 30 

<212> DNA 

<213> Homo sapiens 

<400> 4 

cgatggcaca gcgatacggc caggcattga 

<210> 5 

<211> 31 

<212> DNA 

<213> Homo sapiens 

<400> 5 

gctgtgccat cgtccgtcat aaagtcaaac a 
<210> 6 

<211> 31 

<212> DNA 

<213> Homo sapiens 

<400> 6 

gaggtgggtg tccctatggc tttccaacct a 

<210> 7 

<211> 31 

<212> DNA 

<213> Homo sapiens 

<400> 7 

gacacccacc tcaccctctc actcatgtga t 

<210> 8 

<211> 29 

<212> DNA 

<213> Homo sapiens 

<400> 8 

cccgtgagga cagacccaaa acccaaaat 

<210> 9 

<211> 28 

<212> DNA 

<213> Homo sapiens 

<400> 9 

caacgggtca ggaggggttg atgggatt 

<210> 10 

<211> 29 

<212> DNA 

<213> Homo sapiens 

<400> 10 

cgtgtgggtg ctgagggagg ctgtcagtg 

<210> 11 
<211> 24 
<212> DNA 

<213> Homo sapiens 
<400> 11 

tcgcagagga ggccattgga aata 

<210> 12 

<211> 25 

<212> DNA 

<213> Homo sapiens 

<400> 12 

gtaagcggtc gtgctggctg ttaga 

<210> 13 

<211> 29 

<212> DNA 

<213> Homo sapiens 

<400> 13 

cacgaccgct tactggaatt ctcttgact 






<210> 14 

<211> 27 

<212> DNA 

<213> Homo sapiens 

<400> 14 

gccaacagtg gcataaccct gtagtat 27 

<210> 15 

<211> 25 

<212> DNA 

<213> Homo sapiens 



<210> 16 
<211> 28 
<212> DNA 

<213> Homo sapiens 
<400> 16 

ggtccgggat caaaggtatg tttttaag 28 

<210> 17 
<211> 27 
<212> DNA 

<213> Homo sapiens 



<210> 18 

<211> 27 

<212> DNA 

<213> Homo sapiens 

<400> 18 

acggagacag gaagcagatt caagtag 27 